Where and What
Authors
Abstract
Human drivers use their attentional mechanisms to focus on critical objects and make decisions while driving. As human attention can be revealed from gaze data, capturing and analyzing gaze information has emerged in recent years as a way to benefit autonomous driving technology. Previous works in this context have primarily aimed at predicting "where" drivers look and lack knowledge of "what" they focus on. Our work bridges the gap between pixel-level and object-level attention prediction. Specifically, we propose to integrate an attention prediction module into a pretrained object detection framework and to predict attention in a grid-based style. Furthermore, attended-to objects are recognized based on the predicted attended-to areas. We evaluate our proposed method on two driver attention datasets, BDD-A and DR(eye)VE. The proposed method achieves competitive state-of-the-art performance on both datasets but is far more efficient (75.3 GFLOPs less) in computation.
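A minimal sketch of the object-level step described above: given a grid of attention probabilities and a list of detected boxes, a box is marked as attended-to when the attention over the cells it covers is high enough. The grid size, threshold, aggregation by mean, and function names are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def attended_objects(attention_grid, boxes, image_size, threshold=0.5):
    """Mark detected boxes as attended-to from a grid-based attention map.

    attention_grid: (Hg, Wg) array of attention probabilities per grid cell.
    boxes: (N, 4) array of detections as (x1, y1, x2, y2) in pixels.
    image_size: (height, width) of the input frame.
    threshold: minimum mean attention over a box to call it attended-to.
    """
    img_h, img_w = image_size
    grid_h, grid_w = attention_grid.shape
    cell_h, cell_w = img_h / grid_h, img_w / grid_w

    attended = []
    for x1, y1, x2, y2 in boxes:
        # Grid cells overlapped by the box (inclusive ranges, clipped to the grid).
        r1, r2 = int(y1 // cell_h), min(int(y2 // cell_h), grid_h - 1)
        c1, c2 = int(x1 // cell_w), min(int(x2 // cell_w), grid_w - 1)
        score = attention_grid[r1:r2 + 1, c1:c2 + 1].mean()
        attended.append(score >= threshold)
    return np.array(attended)

# Example: a 16x16 attention grid over a 720x1280 frame and two detections.
grid = np.random.rand(16, 16)
boxes = np.array([[100, 200, 300, 400], [900, 50, 1100, 300]], dtype=float)
print(attended_objects(grid, boxes, image_size=(720, 1280)))
```

Mean attention over the overlapped cells is only one possible aggregation; taking the maximum over the cells would favor small, sharply attended objects instead.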
Similar resources
Baryons: What, When and Where?
We review the current state of empirical knowledge of the total budget of baryonic matter in the Universe as observed since the epoch of reionization. Our summary focuses on three milestone redshifts since the reionization of H in the IGM, z = 3, 1, and 0, with emphasis on the endpoints. We review the observational techniques used to discover and characterize the phases of baryons. In the spir...
What Went Where
We present a novel framework for motion segmentation that combines the concepts of layer-based methods and feature-based motion estimation. We estimate the initial correspondences by comparing vectors of filter outputs at interest points, from which we compute candidate scene relations via random sampling of minimal subsets of correspondences. We achieve a dense, piecewise smooth assignment of p...
Where-What Network 1: “Where” and “What” Assist Each Other Through Top-down Connections
This paper describes the design of a single learning network that integrates both object location (“where”) and object type (“what”), from images of learned objects in natural complex backgrounds. The in-place learning algorithm is used to develop the internal representation (including synaptic bottom-up and top-down weights of every neuron) in the network, such that every neuron is responsible ...
Stacked What-Where Auto-encoders
We present a novel architecture, the “stacked what-where auto-encoders” (SWWAE), which integrates discriminative and generative pathways and provides a unified approach to supervised, semi-supervised and unsupervised learning without relying on sampling during training. An instantiation of SWWAE uses a convolutional net (Convnet) (LeCun et al. (1998)) to encode the input, and employs a deconvol...
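A minimal sketch of the “what”/“where” idea from that excerpt, assuming a PyTorch-style max pooling that returns argmax indices; the tensor sizes and variable names are illustrative and this is not the SWWAE reference implementation.

```python
import torch
import torch.nn as nn

# "What": the pooled activation values passed up the encoder.
# "Where": the argmax indices, passed laterally to the decoder so the
# unpooling step can place values back at their original spatial positions.
pool = nn.MaxPool2d(kernel_size=2, stride=2, return_indices=True)
unpool = nn.MaxUnpool2d(kernel_size=2, stride=2)

x = torch.randn(1, 16, 32, 32)         # a feature map from a conv layer
what, where = pool(x)                  # what: (1, 16, 16, 16); where: argmax indices
x_reconstructed = unpool(what, where)  # back to (1, 16, 32, 32), values at argmax positions

print(what.shape, x_reconstructed.shape)
```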
Journal
Journal title: Proceedings of the ACM on Human-Computer Interaction
Year: 2022
ISSN: 2573-0142
DOI: https://doi.org/10.1145/3530887